System for handling exceptions occurring during parallel execution of microinstructions

ABSTRACT

A method of handling a fault associated with a first floating point instruction upon reaching the next sequential floating point instruction is described. The first floating point instruction is decoded. A first floating point microinstruction received from a control memory is stored in a first latching means and in a second latching means. The next sequential floating point instruction is decoded. There is a jump to a plurality of exception handler microinstructions stored in the control memory, the jump occurring upon the detection of the fault associated with the first floating point instruction. The plurality of exception handler microinstructions includes an exception handler floating point microinstruction. The exception handler floating point microinstruction received from the control memory is stored in the first latching means, replacing the previous microinstruction stored in the first latching means. The exception handler floating point microinstruction received from the control memory is not stored in the second latching means. The exception handler floating point microinstruction stored in the first latching means is executed. The floating point microinstruction stored in the second latching means is executed. A method for allowing floating point instructions to be executed in a microprocessor in parallel with non-floating point instructions is also described. Circuitry allowing floating point instructions to be executed in parallel with non-floating point instructions is also described.

This is a continuation of application Ser. No. 07/880,133, filed May 6, 1992, now abandoned, which is a continuation of application Ser. No. 07/298,520, filed Jan. 8, 1989, now issued U.S. Pat. No. 5,134,693, entitled "System for Handling Occurrence of Exceptions of Microinstructions while Running Floating Point and Non-Floating Point Instructions in Parallel" by Avtar Saini.

FIELD OF THE INVENTION

The present invention pertains to the field of floating point instruction execution in a microprocessor. More particularly, this invention relates to the execution of floating point instructions in parallel with non-floating point instructions.

BACKGROUND OF THE INVENTION

To represent a large dynamic range of numbers with relatively few bits, floating point representation can be used to explicitly encode a scale factor in each number. A floating point number includes a mantissa, an exponent, and a sign bit that indicates the sign of the mantissa. In contrast, integer instructions, and other non-floating point instructions, typically do not include exponent bits. Examples of floating point numbers include (1) single precision floating point real numbers, (2) double precision floating point real numbers, and (3) extended precision floating point real numbers.

A computer instruction written in a floating point format typically requires more processor clock cycles to complete than a corresponding instruction written in an integer or non-floating point format. For example, instructions requiring the addition, subtraction, multiplication, or division of floating point numbers each require the execution of an algorithm with several steps. One of the steps is the normalization of the result. A non-zero floating point number is normalized if the left-most bit of the mantissa is non-zero. The normalized representation of zero is all zeroes. A denormalized number is a number not in the normalized format.
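The normalization step described above can be sketched as follows. This is an illustrative model only: the function name normalize, the 8-bit mantissa width, and the convention that the value equals mantissa times two to the power of exponent are assumptions made for the example and are not taken from the described hardware.

```python
def normalize(mantissa: int, exponent: int, width: int = 8) -> tuple[int, int]:
    """Shift the mantissa left until its left-most bit (bit width-1) is set.

    The represented value, mantissa * 2**exponent, is unchanged because the
    exponent is decremented once per left shift. The width and the value
    convention are assumptions made for this sketch.
    """
    if mantissa < 0 or mantissa >> width:
        raise ValueError("mantissa does not fit in the assumed width")
    if mantissa == 0:
        # The normalized representation of zero is all zeroes.
        return 0, 0
    while not (mantissa >> (width - 1)) & 1:
        mantissa <<= 1
        exponent -= 1
    return mantissa, exponent


# A denormalized input with three leading zero bits needs three shifts.
print(normalize(0b00010110, 0))
```

Under these assumptions, the loop runs once per leading zero bit, which is one reason a floating point operation can take more clock cycles than its integer counterpart.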

A computer can have integer instructions and floating point instructions intermixed. For example, a series of integer instructions can follow a floating point instruction. As discussed above, floating point instructions typically take longer to execute than integer instructions. For example, an integer ADD instruction typically takes one clock cycle. On the other hand, a floating point ADD instruction typically takes 8 to 10 clock cycles to complete. An integer LOAD instruction typically takes one clock cycle to complete. On the other hand, a floating point LOAD instruction typically takes multiple clock cycles to complete. Moreover, typically 25 to 30 percent of all instructions in a work station environment are floating point instructions.

In one past approach, floating point instructions are handled by a separate chip such as the 80387 80-bit CHMOS III Numeric Processor Extension sold by Intel Corporation of Santa Clara, California. Integer instructions, however, are handled by a separate main microprocessor, such as the 80386 32-bit CHMOS microprocessor sold by Intel Corporation.

The 80387 is a co-processor. The 80386 microprocessor decodes a floating point instruction and passes to the 80387 all the relevant information from the floating point instruction needed by the 80387 to execute the floating point instruction. Once that information is passed from the 80386 to the 80387, the 80386 can proceed to execute any subsequent integer instructions until the 80386 reaches the next floating point instruction. The 80387 has its own control read-only memory ("ROM") and control logic, and the 80386 in turn has its own control ROM and control logic.

Thus, with the prior two-chip 80387 and 80386 approach, floating point instructions are executed in parallel with non-floating point instructions. From an overall system performance point of view, however, the passing of relevant floating point information between the 80386 and the 80387 imposes a significant performance penalty in the form of interface overhead.

In some other prior approaches, a floating point unit is placed on the same chip as the microprocessor. This removes the interface overhead that would otherwise occur if the floating point unit were on a separate chip. In those past approaches that put the floating point unit on the same chip as the microprocessor, however, the floating point instructions are not executed in parallel with the non-floating point instructions. Instead, all instructions are executed sequentially. In other words, the microcomputer waits for the execution of a floating point instruction to finish before moving on to the next instruction. Therefore, although some performance is gained by removing interface overhead, some performance is lost because of the lack of parallelism.

Furthermore, although the floating point unit is on the microprocessor chip in those past non-parallel approaches, the floating point unit nevertheless has its own control ROM and control logic, and the microprocessor in turn has its own separate control ROM and control logic.

If the floating point unit could be placed on the microprocessor chip in a way that floating point instructions could be executed in parallel with integer instructions, there would be a gain in performance. One way to do this might be to introduce parallelism but yet have two separate microcoded control ROMs--namely, a floating point control ROM and an integer control ROM.

One disadvantage of this multiple control ROM approach is that it would require the duplication of the control logic--there would need to be control logic at the periphery of each control ROM.

Another disadvantage of the multiple control ROM approach is that it would add to the complexity of the "who is in charge" decision.

A further disadvantage of this multiple control ROM approach is that it would require additional hardware to allow the sharing of resources, and such hardware would be complex and take additional space in silicon. For example, if a floating point execution unit and an integer execution unit were to operate at once, those units might need to share the same bus or the same addressing unit. Complex circuitry would be required to oversee such sharing of resources.

Another consideration with respect to floating point units is that exception conditions must be handled somehow, regardless of whether a parallel or non-parallel approach is used. Although the prior non-parallel instruction execution method imposed a performance penalty, the handling of exception conditions is nevertheless a straightforward task if there is no parallelism. Exceptions are handled as soon as they arise if there is no parallelism.

Examples of those exceptions are invalid operation, denormalized operand, zero divisor, overflow, underflow, and inexact result in terms of precision. Microcode is used to assist the hardware in handling the exceptions. The exceptions are divided into two types of problems: (1) pre-execution assist and (2) post-execution fault. With pre-execution assist, the microcode corrects the problem before execution is finished. With post-execution faults, the problems are corrected after instruction execution.

SUMMARY AND OBJECTS OF THE INVENTION

In view of limitations of known systems and methods, one of the objectives of the present invention is to provide an improved method and circuitry for allowing floating point instructions to be executed in a microprocessor in parallel with non-floating point instructions.

Another objective of the present invention is to provide an improved method of handling an exception associated with a first floating point instruction upon reaching the next sequential floating point instruction.

These and other objects of the invention are provided for by a method of handling an exception associated with a first floating point instruction upon reaching the next sequential floating point instruction. The first floating point instruction is decoded. A first floating point microinstruction received from a control memory is stored in a first latching means and in a second latching means. The next sequential floating point instruction is decoded. There is a jump to a plurality of exception handler microinstructions stored in the control memory, the jump occurring upon the detection of the exception associated with the first floating point instruction. The plurality of exception handler microinstructions includes an exception handler floating point microinstruction. An exception handler floating point microinstruction received from the control memory is stored in the first latching means, replacing the previous microinstruction stored in the first latching means. The exception handler floating point microinstruction is not stored in the second latching means. The exception handler floating point microinstruction stored in the first latching means is executed.

The above-mentioned objects and other objects of the invention are also provided for by circuitry in a microprocessor allowing floating point instructions to be executed in parallel with non-floating point instructions. The circuitry includes means for decoding floating point instructions and non-floating point instructions. The circuitry also includes a control memory coupled to an output of the decoding means and including (1) a plurality of exception handler microinstructions including a floating point microinstruction and a non-floating point microinstruction, (2) a non-exception handler floating point microinstruction, and (3) a non-exception handler non-floating point microinstruction, wherein the control memory is for microprocessor control. The circuitry also includes means for executing a non-floating point microinstruction received from an output of the control memory. The circuitry also includes a latching unit coupled to the output of the control memory and in parallel with the non-floating point execution means, wherein the latching unit is for (1) storing a floating point microinstruction from the control memory in a first latching means, wherein the floating point microinstruction can be an exception handler floating point microinstruction and a non-exception handler floating point microinstruction, and for (2) storing a non-exception handler floating point microinstruction from the control memory in a second latching means. The circuitry also includes means for executing a floating point microinstruction received from an output from the latching unit.

Other objects, features, and advantages of the present invention will be apparent from the accompanying drawings and from the detailed description which follows below.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements and in which:

FIG. 1 is a block diagram of the architecture of a microprocessor;

FIG. 2 is a block diagram of the architecture of the floating point unit of the microprocessor;

FIG. 3 is a block diagram of interface circuitry;

FIG. 4 is a block diagram of a microcode latching unit;

FIG. 5 illustrates a sequence of microinstructions and exception handler microinstructions; and

FIG. 6 is a flow chart illustrating parallel operation of microinstructions and exception handling.

DETAILED DESCRIPTION

FIG. 1 is a block diagram of the architecture of a 32-bit microprocessor 60 that includes a floating point unit 78. In a preferred embodiment of the present invention, floating point unit 78 is not on a separate chip, but is instead part of microprocessor 60. Floating point unit 78 contains the logic to execute the floating point instruction set associated with microprocessor 60. Floating point unit 78 can execute floating point instructions at the same time that other parts of microprocessor 60 are executing non-floating point instructions, and thus there is parallel operation. Floating point unit 78 is described in more detail below.

Microprocessor 60 is organized as follows. Bus interface unit 62 is responsible for fetching data from an external memory system (not shown). Bus interface unit 62 is also responsible for updating the external memory system when there is a write. Bus interface unit 62 provides the necessary interfaces between XA bus 61, XD bus 63, cache unit 66, and prefetcher unit 64. XA bus 61 is a 32-bit bus coupled between the external memory of microprocessor 60 and bus interface unit 62 for sending and receiving addresses. XD bus 63 is a 32-bit bus coupled between the external memory and bus interface unit 62 for sending and receiving data. Bus interface unit 62 is coupled to cache unit 66 via KBA bus 65 (also referred to as Address bus 65), which is a bidirectional address bus, and KBWR bus 67 (also referred to as Data bus 67). Bus interface unit 62 receives data from cache 66 via KBWR bus 67. Bus interface unit 62 sends data to prefetcher unit 64 via KBRD bus 69.

R bus 138, M bus 140, and LA bus 89 provide the main data path for microprocessor 60. R bus 138 couples cache 66 to data path unit 76 and floating point unit 78. M bus 140 couples cache 66 with segmentation unit 74, data path unit 76, and floating point unit 78. LA bus 89 couples cache 66 with paging unit 72 and segmentation unit 74.

Data path unit 76 is the main execution data path. Data path unit 76 contains the arithmetic logic unit ("ALU"), a register file, a barrel shifter, a constant ROM, a machine status word ("MSW"), and flags.

Segmentation unit 74 implements the segmentation part of the overall memory management model. I bus 90 couples data path unit 76 to segmentation unit 74. K2Q bus 73 couples prefetcher unit 64 with segmentation unit 74, and is a 32-bit bus.

Paging unit 72 implements a two-level paging mechanism of the overall memory management model. PA bus 75 is a 22-bit bus that couples paging unit 72 with cache 66.

Prefetcher unit 64 is responsible for supplying decode unit 68 with macroinstructions via KIQ bus 71.

Instruction decode unit 68 is responsible for decoding the incoming macroinstructions for microprocessor 60. Instruction decode unit 68 is coupled to control unit 70 via IWORD bus 77 and Entry Point bus 79.

Microprocessor 60 uses pipelined instructions. The instruction pipeline includes (1) a prefetch stage, (2) an instruction decode stage, (3) an execution stage, and (4) a write-back stage. With pipelining, when a decoded macroinstruction goes on to the execution stage, the instruction decode stage can begin for the next macroinstruction.

Control unit 70 is the microcode engine of microprocessor 60. Control unit 70 contains microcoded control ROM 92 and control logic that directs the actions of the other units of microprocessor 60, such as floating point unit 78. Instruction decode unit 68 sends entry point information to microcoded control ROM 92 over entry point bus 79 to indicate the location in control ROM 92 of the first microinstruction of a group of one or more microinstructions to be executed. The microinstructions themselves provide information to control unit 70 for control unit 70 to determine how many microinstructions are to be executed for a given entry point.

The control ROM microinstructions, once executed, provide control for microprocessor 60 and floating point unit 78.

The control unit handles most of the freeze conditions, such as when the Fbusy signal is on, as described below. Control unit 70 also contains test logic. The test logic of control unit 70 provides the proper microcode ROM vectoring for floating point exceptions.

Control unit 70 sends microinstructions to segmentation unit 74 via buses 81 and 83. Control unit 70 sends microinstructions to data path unit 76 via buses 81 and 85. Control unit 70 sends microinstructions to floating point unit 78 via buses 81 and 87.

The floating point microinstructions share the same microcode ROM 92 as the integer microinstructions. The floating point microinstructions take advantage of available early start actions and address calculation directed by instruction decoder 68. Once the necessary set-up is complete, floating point unit 78 is capable of executing arithmetic operations on its own, allowing the rest of microprocessor 60 to be freed up for other non-floating point operations. In other words, a non-floating point instruction can be executed in parallel with a floating point microinstruction.

FIG. 2 is a block diagram of the architecture of floating point unit 78. Floating point unit 78 is concerned with the execution of basic arithmetic floating point operations. The instruction decode, address calculation, and control functions for floating point instructions are, however, carried out by the portion of microprocessor 60 outside of floating point unit 78 as an extension of the integer instruction set charter for microprocessor 60.

Floating point unit 78 deals with faults generated by a floating point microinstruction when it reaches the next sequential floating point microinstruction. Non-floating point microinstructions can, however, be executed between the initial floating point microinstruction and the next sequential floating point microinstruction. In other words, the portion of microprocessor 60 outside of floating point unit 78 can execute interleaved non-floating point microinstructions while floating point unit 78 is operating on the current floating point microinstruction.

Floating point data is transferred between cache 66 (of FIG. 1) and interface unit 101 of floating point unit 78. Floating point data could be received by interface unit 101 either from (1) M bus 140 or (2) both R bus 138 and M bus 140. Interface unit 101 is responsible for moving data between (1) R bus 138 and M bus 140 and (2) mantissa latch 313 and exponent latch 311, as described below in connection with FIG. 3.

FP interface data path unit ("Fint unit") 102 and FP interface control unit ("Fintr unit") 104 are parts of FP interface unit 101. Fint unit 102 contains the data path for FP interface unit 101. Fintr unit 104 decodes incoming microinstructions or microcode and issues control signals to Fint unit 102.

FIG. 3 illustrates Fint unit 102 in more detail in block diagram form. Incoming data is latched into MRhigh latch 303 and MRlow latch 305. Outgoing data is latched in Mout latch 307. Multiplexer 301 permits MRhigh latch 303 to get data from either M bus 140 or R bus 138, depending on the number of bits being transferred to floating point unit 78. Shift array 309 separates and aligns the incoming data from MRhigh latch 303 and MRlow latch 305 (via lines 326 and 328) into mantissa and exponent parts. Incoming data could be, for example, single, double, or extended precision floating point real numbers. The output from shift array 309 is latched into mantissa latch 313 and exponent latch 311 (via lines 332, 334, and 336). Exponent latch 311 is coupled to exponent EXA bus 182 and sign bit line 338. Mantissa latch 313 is coupled to mantissa Abus 146.
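The separation of incoming data into sign, exponent, and mantissa parts can be illustrated in software. The sketch below assumes the widely used IEEE 754 single precision layout (1 sign bit, 8 exponent bits, 23 fraction bits); the function name split_single is hypothetical, and the sketch does not model the alignment performed by shift array 309 or the internal formats of floating point unit 78.

```python
import struct


def split_single(x: float) -> tuple[int, int, int]:
    """Return (sign, biased exponent, fraction) of x as a 32-bit float.

    Assumes the IEEE 754 single precision layout; this is an illustration,
    not a description of the shift array hardware.
    """
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF        # 23 bits, implicit leading one omitted
    return sign, exponent, fraction


print(split_single(1.0))    # sign 0, biased exponent 127, fraction 0
print(split_single(-2.5))
```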

Special number detect circuitry 315 is coupled to exponent EXA bus 182 via lines 340. Special number detect circuitry 315 is also coupled to mantissa Abus 146 via lines 342. Special number detect circuitry 315 recognizes special cases of floating point operands when data is transferred on Abus 146 and EXA bus 182. Those special cases include operands that are not a number ("NaN"), denormalized numbers, infinity, and zero.

When a special number is detected, then special number signals are sent via lines 344 to FP control unit ("Fconr unit") 217 of FIG. 2, which then sets the appropriate bits in trap register 367 of FIG. 2. Thus, logic in Fint unit 102 detects cases such as the mantissa and exponent of a floating point number being all zeroes, but final determination of special numbers is done in Fconr unit 217.
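As an illustration of the kind of test that special number detection involves, the following sketch classifies an operand by inspecting its exponent and fraction fields. It assumes the IEEE 754 double precision layout (11 exponent bits, 52 fraction bits); the function name classify and the returned labels are hypothetical and do not correspond to signals on lines 344 or to bits of trap register 367.

```python
import struct


def classify(x: float) -> str:
    """Classify x as nan, infinity, zero, denormalized, or normal.

    Assumes the IEEE 754 double precision layout; illustrative only.
    """
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    if exponent == 0x7FF:
        # All-ones exponent: infinity if the fraction is zero, else NaN.
        return "nan" if fraction else "infinity"
    if exponent == 0:
        # All-zero exponent: zero if the fraction is zero, else denormalized.
        return "denormalized" if fraction else "zero"
    return "normal"


for value in (float("nan"), float("inf"), 0.0, 5e-324, 1.5):
    print(value, classify(value))
```

In this layout, every special case is recognizable from an all-zero or all-one exponent field combined with a zero or non-zero fraction, which is why the check can be made while the data is still in transit.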

Although Fint unit 102 is part of floating point unit 78, as shown in FIG. 2, Fint unit 102 appears to the microcode to be an extension outside of floating point unit 78. For example, while the floating point unit 78 is working on a multi-clock floating point instruction, the microcode can run bus cycles and get data for the next sequential instruction loaded into latches 303 and 305 of Fint unit 102 of interface unit 101. This is made possible by extra decode circuitry in Fintr unit 104. Fintr unit 104 has its own parallel set of microcode lines 224 for receiving control microinstructions from FP microcode latching unit ("Fmicro unit") 220, described below. Fintr unit 104 decodes the microcode information and then sends out control signals to Fint unit 102.

The mantissa data path is shown on the right side of FIG. 2. The mantissa data path comprises accumulator 105, mantissa adder 107, operand registers 109 (which includes registers OPA 375, OPB 377, and OPC 379), shifter 111 (which includes register SReg 381), mantissa multiplier 113, and mantissa ROM 115. Control for the mantissa data path comes from control logic 119.

Stack unit 103 contains registers forming a stack. Stack unit 103 is coupled to both Abus 146 and EXA bus 182, and is thus part of both mantissa and exponent paths.

The exponent data path is shown on the left side of FIG. 2 and comprises exponent adder 123, multiplexer 125, exponent register EXA 127, exponent register EXB 129, and exponent ROM 131. The exponent data path also includes EXA bus 182 and EXB bus 192. All control and random logic for the exponent data path is contained in Fconr unit 217, FP PLA unit ("Fpla unit") 227, and FP decoder unit ("Fdecr unit") 223 of control logic unit 119.

Exponent register EXB 129 can be examined to detect underflow and overflow fault conditions. Fconr unit 217 monitors EXB register 129 and makes the decision as to whether the fault is underflow or overflow.

Trap logic circuitry 121 of floating point unit 78 comprises FP trap unit ("Ftrap unit") 122 and FP trap random logic unit ("FtrapR unit") 124. Ftrap unit 122 includes latches and bus drivers. FtrapR unit 124 includes peripheral random logic.

Ftrap unit 122 includes the following user-visible 16-bit registers: control word ("CW") register 361, status word ("SW") register 363, and tag word ("TW") register 365.

The control word in CW register 361 is a user-defined specification as to how the user would like the instructions executed. For example, CW register 361 includes information about which, if any, of the exceptions should be masked.

The status word in register 363 includes information about any unmasked exceptions that were encountered during execution of an instruction. Both the control word and the status word can be read from and written to R bus 138 by the microcode.

To allow an instruction to be restarted after the detection of a fault, temporary copies of the status word and tag word are used during operation. The temporary copy of the status word ("TSW") is stored in TSW register 369. The temporary copy of the tag word ("TTW") is stored in TTW register 371. After the point during instruction execution when the generation of an exception is no longer a possibility, TSW is placed in SW register 363 and TTW is placed in TW register 365. This copying of the temporary copies of the status and tag words into respective registers SW 363 and TW 365 normally happens on the last clock of the instruction, which is referred to as "LMI." In contrast, "CNEWI" is the first microinstruction for a new macroinstruction. Floating point unit 78 uses the CNEWI signal to copy what is stored in status word register 363 into TSW register 369 and to copy what is stored in tag word register 365 into TTW register 371.
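The use of temporary copies to keep an instruction restartable can be modeled in a few lines. The sketch below is a simplified software analogy: the class name ShadowedWords, the method names cnewi and lmi, and the fault argument are assumptions made for illustration, and register widths and individual bit meanings are ignored.

```python
class ShadowedWords:
    """Status and tag words with temporary working copies (illustrative)."""

    def __init__(self, sw: int = 0, tw: int = 0) -> None:
        self.sw, self.tw = sw, tw      # architectural SW and TW
        self.tsw, self.ttw = sw, tw    # temporary TSW and TTW

    def cnewi(self) -> None:
        # Start of a new macroinstruction: refresh the temporary copies
        # from the architectural registers.
        self.tsw, self.ttw = self.sw, self.tw

    def lmi(self, fault: bool = False) -> None:
        # Last microinstruction: commit the temporary copies unless a
        # fault was raised, in which case SW and TW stay untouched.
        if not fault:
            self.sw, self.tw = self.tsw, self.ttw


words = ShadowedWords(sw=0x0000, tw=0xFFFF)
words.cnewi()
words.tsw = 0x0001        # the instruction updates only the temporary copy
words.lmi(fault=True)     # a fault occurred, so nothing is committed
print(hex(words.sw))      # architectural SW still holds its old value
```

Because only the temporary copies change during execution, a faulting instruction leaves the architectural words exactly as they were at its start, which is what allows it to be reexecuted later.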

Ftrap unit 122 also includes trap register 367. Trap register 367 accumulates information during the course of executing an instruction and is cleared upon the last clock of the instruction if no faults are encountered during the course of instruction execution. Trap register 367 of Ftrap unit 122 has the information required by the exception handler for branching to the correct area in the exception handler code. That information includes an error summary bit, bits indicating stack underflow fault or stack overflow fault, a bit indicating an exponent underflow fault or overflow fault, and a bit indicating the sign of accumulator 105. Said bit indicating the sign of accumulator 105 also signifies mantissa overflow fault. Trap register 367 can be read from and written to R bus 138.

Control logic 119 receives microinstructions on lines 87 from control circuitry 70 (of FIG. 1). Control logic 119 in turn provides control for floating point unit 78. Control logic unit 119 includes Fmntr unit 218, Fconr unit 217, Fmicro unit 220, Fdecr unit 223, and Fpla unit 227. Fpla 227 is also referred to as nanosequencer 227.

Control logic 119 also sends out Fbusy and Ferror signals on lines 212. Control logic 119 is coupled to trap circuitry 121 via lines 180. Control logic 119 is coupled to shift count value ("SCVAL") bus 160. Control logic 119 is coupled to accumulator 105 via lines 172. Control logic 119 is also coupled to loop counter 117 via lines 176.

Microcode incoming on lines 87 to control logic 119 is latched in Fmicro unit 220 and decoded in Fdecr unit 223. All single clock microinstructions are executed as control lines coming out of Fdecr unit 223.

For microprocessor 60 to be able to execute one or more non-floating point instructions in parallel with a multi-clock floating point instruction, floating point unit 78 needs to latch the microcode or microinstruction information and use it all through the execution of the floating point microinstruction. But this one level of latching is not enough to handle the floating point exceptions. In the event that floating point unit 78 raises a floating point error, the microcode jumps to the exception handler microinstructions stored in control ROM 92 (of FIG. 1). The exception handler subroutine may need to issue a multi-clock microinstruction to floating point unit 78, and at a later point request floating point unit 78 to reexecute the original floating point microinstruction. To make such a restart possible, Fmicro unit 220 has a second latch, described below, wherein only the original instruction is latched. Latching into this second latch is disabled when a floating point error is encountered. It resumes only after the reexecute command--called "INDIRECT_EXECUTE"--is issued or, in the case where the microcode does not want to come back to this instruction, after a "NEWI" instruction is issued, which indicates the start of a new macroinstruction.

FIG. 4 illustrates in block diagram form the architecture of Fmicro unit 220. Incoming microcode is latched in input latch 251. Input latch 251 receives the microcode via lines 87. Input latch 251 is clocked by the phase one clock via lines 242. For timing reasons, the microcode signals, although arriving in phase one, are coming off a phase two latch. Therefore, the first thing floating point unit 78 does is to latch the microcode in input latch 251 and apply the phase one clock via lines 242. Following this input latch 251, there is a three-to-one multiplexer 253, which is 27 bits wide. Following multiplexer 253 is an intermediate latch 255 that is clocked by the phase two clock via lines 244. Lines 222 couple input latch 251 to the three-to-one multiplexer 253. Lines 222 and 224 couple the output of input latch 251 to Fintr unit 104. Lines 232 couple the output of multiplexer 253 to the input to intermediate latch 255. Lines 232 and 230 couple the output of multiplexer 253 to the floating point unit decoders, Fdecr 223 and Fconr 217.

The output of intermediate latch 255 is coupled to the input of execution latch 257 and the input of indirect execute latch 259. Lines 234 and 236 couple intermediate latch 255 to execution latch 257. Lines 234 and 238 couple intermediate latch 255 to indirect execute latch 259. Execution latch 257 is clocked by the phase one clock via lines 246. Gate 261 is an AND gate that performs a logical "AND" of a "latch-OK" signal received from Fconr unit 217 and a phase one clock signal. The output of gate 261 is applied as a signal to indirect execute latch 259 via line 248. The output of execution latch 257 is coupled to the input of multiplexer 253 via lines 226. The output of indirect execute latch 259 is coupled to the input of multiplexer 253 via lines 228. Multiplexer 253 receives control signals DECNEW, DECOLD, and DECUIL via lines 240 from Fconr unit 217. Signal DECNEW causes multiplexer 253 to pass the output of input latch 251 to the input of intermediate latch 255. Signal DECOLD causes multiplexer 253 to pass the output of execution latch 257 to the input of intermediate latch 255. Signal DECUIL causes multiplexer 253 to pass the output of indirect execute latch 259 to the input of intermediate latch 255.

Indirect execute latch 259 retains a copy of the original floating point microinstruction. Execution latch 257, however, can contain the microinstruction that is received during a jump to the exception handler microinstructions.
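The interplay of the latches and the three-to-one multiplexer can be followed with a small behavioral model. The sketch below is a deliberate simplification: the class name FmicroModel and its step method are hypothetical, the two clock phases are collapsed into a single step, and microinstructions are represented as strings rather than 27-bit words. Only the select names DECNEW, DECOLD, and DECUIL and the latch-OK gating are taken from the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FmicroModel:
    """Simplified single-step model of the microcode latching unit."""

    input_latch: Optional[str] = None
    intermediate: Optional[str] = None
    execution: Optional[str] = None
    indirect: Optional[str] = None

    def step(self, microcode: Optional[str], select: str, latch_ok: bool) -> Optional[str]:
        self.input_latch = microcode
        # Three-to-one multiplexer feeding the intermediate latch.
        sources = {
            "DECNEW": self.input_latch,  # newly received microinstruction
            "DECOLD": self.execution,    # contents of the execution latch
            "DECUIL": self.indirect,     # contents of the indirect execute latch
        }
        self.intermediate = sources[select]
        # The execution latch always follows the intermediate latch.
        self.execution = self.intermediate
        # The indirect execute latch follows only when latch-OK is asserted.
        if latch_ok:
            self.indirect = self.intermediate
        return self.intermediate


unit = FmicroModel()
unit.step("FADD", "DECNEW", latch_ok=True)            # original microinstruction
unit.step("HANDLER_FIX", "DECNEW", latch_ok=False)    # handler microinstruction
print(unit.execution, unit.indirect)                  # handler vs. original
print(unit.step(None, "DECUIL", latch_ok=True))       # reexecute the original
```

In this model the handler microinstruction overwrites only the execution latch, so the original microinstruction survives in the indirect execute latch and can be selected again with DECUIL.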

Returning to FIG. 2, floating point controller 387 controls data movement and operations within floating point unit 78. Floating point controller 387 comprises Fconr unit 217 and Fpla unit 227. Floating point controller 387 decodes the incoming microinstruction sent from Fmicro unit 220 via lines 230 (shown in FIG. 4) and sends out bus source and destination information to the mantissa and exponent data paths. Floating point controller 387 generates control signals for basic arithmetic operations. Controller 387 also handshakes with control unit 70 (of FIG. 1) by sending Fbusy and Ferror signals on lines 212 to control unit 70. Fpla 227 is the sequencer used to execute all multi-clock arithmetic operations. Fpla unit 227 has the algorithms to do all the basic arithmetic operations and some primitive operations used by the microcode during exception handling.

Fconr unit 217 collects information and sends out control signals. Fconr unit 217 also sends signals, such as "Latch-OK," DECNEW, DECOLD, and DECUIL, to Fmicro unit 220 directing the latching and multiplexing of incoming microcode fields in Fmicro unit 220. Fconr 217 generates Fbusy and Ferror signals on lines 212 and sends those signals to control unit 70 of FIG. 1. The Fbusy and Ferror signals are decoded by control unit 70 to determine whether to freeze microcode.

The Fbusy signal is generated by looking at the opcode field of incoming microcode to see if a floating point execute opcode is encoded that will take more than one clock to execute. After the first clock of execution, the Fbusy signal remains high until the nanosequencer 227 reaches an idle state or a pre-execution assist or post-execution fault is detected.

The Ferror signal is driven high and flagged to control unit 70 when floating point unit 78 needs a pre-execution assist or encounters a post-execution fault. Floating point unit 78 will cease all execution from that point on. Non-reversible operations, such as LMI and write to stack (which may have been issued along with the microinstruction that caused an error), will be aborted. Floating point unit 78 will resume execution only on a wake-up microinstruction or on a microprocessor chip reset.
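The behavior of the Fbusy and Ferror signals described above can be summarized as a small state model. This is a sketch under simplifying assumptions: the class name FpuStatus and its method names are hypothetical, timing within a clock is ignored, and a chip reset is treated the same as a wake-up microinstruction.

```python
class FpuStatus:
    """Illustrative model of the Fbusy and Ferror handshake signals."""

    def __init__(self) -> None:
        self.fbusy = False
        self.ferror = False

    def issue(self, multi_clock: bool) -> bool:
        """Issue an execute microinstruction; return True if it is accepted."""
        if self.ferror:
            return False             # the unit has ceased execution
        self.fbusy = multi_clock     # busy only for multi-clock operations
        return True

    def sequencer_idle(self) -> None:
        self.fbusy = False           # the sequencer reached its idle state

    def raise_exception(self) -> None:
        self.ferror = True           # assist needed or fault encountered
        self.fbusy = False

    def wake_up(self) -> None:
        self.ferror = False          # wake-up microinstruction (or reset)


fpu = FpuStatus()
fpu.issue(multi_clock=True)
print(fpu.fbusy)                      # busy while the operation runs
fpu.raise_exception()
print(fpu.issue(multi_clock=False))   # rejected until a wake-up
fpu.wake_up()
print(fpu.issue(multi_clock=False))   # accepted again
```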

Fconr unit 217 also generates a signal to the rest of floating point unit 78 that indicates that it is okay to execute an operation. For a non-reversible operation, this signal is taken into account before going much further than the decoding state.

In the last microinstruction for a particular macroinstruction, the microcode can give the result destination, and the microcode can request hardware to shadow the status and tag words and clear the trap register. In the case of pre-execution assist or post-execution fault, the latter operations are aborted, and the trap register is left untouched.

Data movement inside floating point unit 78 of FIG. 2 is either directed by the microcode or by nanosequencer 227. The microcode will direct traffic in instances such as load/store operations, getting operands in the correct register before issuing an FPU_EXECUTE ("FPU_EXEC") instruction, fixing operands in case of an assist, or any kind of exception handling routine. Nanosequencer 227 has complete control after microcode has issued a multi-clock FPU_EXEC instruction and floating point unit 78 is ready to go busy. The FPU_EXEC microinstruction specifies the operation to perform on data that has been previously moved to floating point unit 78.

At the same time data is transferred to its destination, the data is checked to see if it is a special case--for example, infinity, not a number, zero, or denormalized--and the flags in the trap register 367 are set accordingly for Sreg 381 and OPA register 375. These flags may be used when the FPU_EXEC microinstruction is issued.
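As a rough illustration of the special-case check described above, the following Python sketch classifies one operand and returns the flags that would be recorded for it. The flag names and the use of Python's 64-bit floats are assumptions made for illustration only; the specification does not give the bit layout of trap register 367 or the operand format at this point.

```python
import math
import sys


def special_case_flags(value: float) -> set:
    """Return hypothetical trap-register flags for one operand."""
    flags = set()
    if math.isnan(value):
        flags.add("NAN")
    elif math.isinf(value):
        flags.add("INFINITY")
    elif value == 0.0:
        flags.add("ZERO")
    elif abs(value) < sys.float_info.min:
        # Nonzero but smaller than the smallest normal double: denormalized.
        flags.add("DENORMAL")
    return flags
```

An ordinary operand such as 1.0 yields an empty set, which in this sketch corresponds to no special-case flag being set.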

Microcode must intervene to help floating-point unit 78 in performing operations from time to time. This intervention is in the form of a microcode assist or fault.

As stated in the background, exceptions are divided into two types of problems: (1) pre-execution assist and (2) post-execution fault.

Pre-execution assists occur if the operand value does not allow the particular operation to continue--for example, the operand is an unsupported number, not a number, infinity, or zero.

Post-execution faults occur in several different cases, as follows:

(1) When the overflow condition is present and the operation is subject to overflow.

(2) When the underflow condition is present and the operation is subject to underflow.

(3) Stack overflow and underflow faults. Faults also occur when the error summary bit is set in trap register 367.

In the case of either a pre-execution assist or a post-execution fault, the microcode will deal with the exception conditions and produce the correct result. When complaining to the microcode, the floating point unit 78 does not distinguish between an assist and a fault. The floating point unit 78 just raises the Ferror signal on lines 212 and stops all execution from that point on. The microcode will service the fault at the next sequential floating point microinstruction. Once the next sequential floating point microinstruction is reached, the microcode freezes, waiting for the Fbusy signal on line 212 to go away. Once the Fbusy signal goes away and if Ferror is a logical high, then the microcode jumps to the exception handler microinstructions. During exception handling, the first thing the microcode does is to issue a wake-up microinstruction. After the wake-up microinstruction, the floating point unit 78 will start execution of the exception handler floating point microinstructions. At the end of the exception handling, the microcode may issue an indirect execute instruction (also referred to as INDIRECT_EXECUTE), after which floating point unit 78 will reexecute the original floating point microinstruction which caused the error to be raised. Alternatively, floating point unit 78 will execute the next sequential floating point microinstruction.

The method of handling exceptions associated with a floating point instruction is illustrated in FIG. 5. At step 270, floating point macroinstruction FI₁ is decoded into one or more microinstruction addresses that are read from control ROM 92 to the Fmicro unit 220 of floating point unit 78 for execution. Said execution typically takes more than one clock cycle to complete.

During the execution of floating point microinstruction FI₁ by floating point unit 78, Fconr unit 217 raises the Fbusy signal on line 212 to a logical high. During the execution of floating point instruction FI₁, Fconr unit 217 checks for the presence of a pre-execution assist or a post-execution fault. If a pre-execution assist is necessary or a post-execution fault occurs, the floating point unit does not distinguish between an assist and a fault. Instead, Fconr 217 just raises the Ferror signal on line 212 to indicate to control unit 70 of FIG. 2 that an error has occurred. The appropriate bits are also set in trap register 367 to indicate what type of error occurred. Control unit 70 then can access trap register 367 to discover what error occurred.

Meanwhile, while floating point microinstruction FI₁ was being executed by floating-point unit 78, integer instructions I₂ and I₃, etc., were being executed by data bus unit 76 of microprocessor 60. In particular, at respective steps 272 and 274, integer microinstructions were transferred from control ROM 92 of microprocessor 60 to data path unit 76.

During the execution of floating point microinstruction FI₁, FI₁ was latched into input latch 251 of Fmicro unit 220 illustrated in FIG. 4. The microinstruction FI₁ was then transferred via lines 222, multiplexer 253, lines 232, intermediate latch 255, and lines 234, 236, and 238 to be latched into execution latch 257 and indirect execute latch 259. Thus, microinstruction FI₁ was latched in execution latch 257 and indirect execute latch 259 of Fmicro unit 220. Said latching was done under the control of Fconr unit 217. A DECNEW signal sent by Fconr unit 217 to multiplexer 253 caused multiplexer 253 to pass the output of input latch 251 to the input of intermediate latch 255. Fconr unit 217 also sent a logical high "latch-OK" signal to gate 261 to permit the latching of FI₁ into indirect execute latch 259.

Once the Ferror signal is raised on line 212, floating point unit 78 stops all execution of floating point microinstruction FI₁ from that point on. The microcode of control unit 70 will service the fault at the next sequential floating point microinstruction, which is floating point microinstruction FI₂ at step 280 shown in FIG. 5.

Once floating-point instruction FI₂ is reached at step 280, control unit 70 of FIG. 1 looks to the Fbusy and Ferror signals on lines 212. If the Fbusy and Ferror signals are both low, then floating point unit 78 executes floating point microinstruction FI₂. If the Fbusy signal is low and the Ferror signal is high, then control unit 70 jumps to the exception handler microinstructions of control ROM 92. If the Fbusy line is high and the Ferror line is low, then floating point unit 78 continues executing floating point microinstruction FI₁, but does not execute floating point microinstruction FI₂ until instruction FI₁ is completed. If the Fbusy and Ferror signals are both high, this is a "don't care" condition because control unit 70 only looks at the Ferror pin after the Fbusy pin goes low.
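The four signal combinations above reduce to a small decision function. The Python sketch below is only an illustration of that truth table; the function name and the returned strings are hypothetical labels, not names used by the hardware or microcode.

```python
def action_on_next_fp(fbusy: bool, ferror: bool) -> str:
    """Decide what happens when the next floating point microinstruction is reached."""
    if fbusy:
        # Ferror is not examined while Fbusy is high (the "don't care" case).
        return "wait"
    if ferror:
        return "jump_to_exception_handler"
    return "execute"
```

Checking Fbusy first mirrors the statement that control unit 70 looks at Ferror only after Fbusy has gone low.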

If the Ferror signal is high, the microcode will jump to the exception handler microinstructions of control ROM 92. The jump to the floating point exception handler microinstructions is shown as step 282 in FIG. 5. The first thing the microcode does after jumping to the exception handler microinstruction is to issue a wake-up microinstruction. The jump to the exception handler microinstruction takes two clock cycles.

As soon as the Ferror signal goes high, latching into indirect execute latch 259 (of FIG. 4) is disabled. The disabling of indirect execute latch 259 is caused by a logical low "latch-OK" signal being sent to gate 261 by Fconr unit 217. The resulting logical low output signal from AND gate 261 is then applied to indirect execute latch 259 via line 248. This keeps indirect execute latch 259 from receiving the next microinstruction. Again, the "latch-OK" signal applied to gate 261 is controlled by Fconr unit 217, which also issues the Ferror signal on lines 212. Anytime there is a jump to the exception handler of control ROM 92, there will also be an Ferror signal on lines 212. Therefore, latching into indirect execute latch 259 will be disabled during the execution of the exception handler microinstructions.

It follows, therefore, that wake-up microinstruction 284 will be stored only in latch 257 of Fmicro unit 220, and not in indirect execute latch 259. Instead, indirect execute latch 259 will retain a copy of floating point microinstruction FI₁.

Latching into indirect execute latch 259 is resumed only if:

(1) A reexecute command is issued (also referred to as the INDIRECT_EXECUTE command) or

(2) A NEWI microinstruction is issued. A NEWI microinstruction indicates the start of a new macroinstruction.
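The gating of the two latches can be pictured with a small software model. The Python sketch below is an assumption-laden illustration, not the circuit itself: the class name, method names, and the `newi` parameter are hypothetical, and the single `latch_ok` flag stands in for the "latch-OK" signal applied to gate 261.

```python
class FmicroLatches:
    """Toy model of execution latch 257 and indirect execute latch 259."""

    def __init__(self):
        self.execution = None  # models execution latch 257
        self.indirect = None   # models indirect execute latch 259
        self.latch_ok = True   # models the "latch-OK" input to gate 261

    def raise_ferror(self):
        # Ferror going high disables latching into the indirect execute latch.
        self.latch_ok = False

    def latch(self, micro, newi=False):
        """Latch an incoming microinstruction; newi marks a new macroinstruction."""
        if newi:
            self.latch_ok = True  # resume condition (2)
        self.execution = micro
        if self.latch_ok:
            self.indirect = micro

    def indirect_execute(self):
        """Return the saved microinstruction and re-enable latching."""
        self.latch_ok = True      # resume condition (1)
        return self.indirect
```

In this model, latching FI₁, raising Ferror, and then latching a wake-up and handler microinstructions changes only `execution`; `indirect` still holds FI₁ until one of the two resume conditions occurs.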

During the exception handler subroutine, the exception handler will issue microcode, such as floating point microinstruction f₁ at step 286 of FIG. 5. Exception handler floating point microinstruction f₁ will be stored in latch 257 of Fmicro unit 220 and not in indirect execute latch 259. This second level of latching allows the handling of floating point exceptions. At the end of the exception handler microinstructions, there could be an INDIRECT_EXECUTE command. If there is an INDIRECT_EXECUTE command, floating point unit 78 will reexecute the instruction which faulted--namely, instruction FI₁--before executing instruction FI₂. This is shown as step 290 in FIG. 5. Steps 272 and 274 are not repeated and integer microinstructions i₂ and i₃ are not reexecuted. If there is no INDIRECT_EXECUTE command, then floating point microinstruction FI₁ is not reexecuted. Instead, the microcode simply goes forward and executes floating point microinstruction FI₂. FI₂ would be a NEWI microinstruction if there is to be a start of a new macroinstruction.

Thus, by having both an execution latch 257 and an indirect execute latch 259, as shown in FIG. 4, floating point instructions can be executed in parallel with non-floating-point instructions, even though floating point unit 78 shares control unit 70 with microprocessor 60. In addition, exceptions associated with a first floating point instruction can be handled upon reaching the next sequential floating point instruction.

In a preferred embodiment of the present invention, there cannot be an exception situation within an exception situation. Microcode will not issue any microinstruction which can cause an error during the exception condition.

Reference is made to FIG. 6, which is a flow chart of operation of the preferred embodiment.

As illustrated beginning in a box 400, the next microinstruction is obtained from the decoder 68 and the control ROM 92. Decoding macroinstructions into a group of microinstructions is conventional. It should be understood that each macroinstruction decoded in the decoder 68 points to one or more microinstructions in control ROM 92. For example, a single floating point macroinstruction may point to a series of five floating point microinstructions, the next macroinstruction may be an integer macroinstruction that points to two integer microinstructions, the subsequent macroinstruction may be another integer macroinstruction that points to three integer microinstructions, and the next macroinstruction may be a floating point macroinstruction that points to eight microinstructions.
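The example above, in which four macroinstructions expand into an intermixed microinstruction stream, can be sketched as a table lookup. In the Python sketch below, the macroinstruction names and the table contents are hypothetical placeholders chosen to match the counts in the example (five, two, three, and eight); they are not entries from control ROM 92.

```python
# Hypothetical table: macroinstruction -> (type, number of microinstructions).
CONTROL_ROM = {
    "FP_A": ("fp", 5),
    "INT_A": ("int", 2),
    "INT_B": ("int", 3),
    "FP_B": ("fp", 8),
}


def decode(macro_stream):
    """Expand macroinstructions into an ordered list of (type, label) microinstructions."""
    micro_stream = []
    for macro in macro_stream:
        kind, count = CONTROL_ROM[macro]
        for step in range(count):
            micro_stream.append((kind, f"{macro}.{step}"))
    return micro_stream
```

Calling `decode(["FP_A", "INT_A", "INT_B", "FP_B"])` yields eighteen microinstructions in macroinstruction order, with the floating point and integer types intermixed as they would appear on the microinstruction bus.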

After the box 400 in which a microinstruction has been provided from the control ROM 92, the flowchart in FIG. 6 indicates that the next step is a decision, in a decision box 402, as to whether or not the microinstruction is a floating point instruction. If it is not a floating point instruction, then as illustrated in a box 404, the microinstruction is provided to the execution unit 76 where it is executed appropriately. However, if the microinstruction is a floating point instruction or a floating point exception microinstruction, then, as illustrated in a decision box 406, the signal Fbusy is checked to see if it is on or off. The Fbusy signal is generated in FP control unit 217 as described with reference to FIG. 2 and FIG. 5, which looks at the incoming microcode to see if a floating point execute opcode is encoded that will take more than one clock. If so, the Fbusy signal remains high until the FP nanosequencer unit 227 reaches an idle state or a pre-execution assist or post-execution fault is detected.

If in the decision box 406, Fbusy indicates that the floating point processor 78 is currently in use, then the microprocessor simply waits as illustrated in a time out box 408 until the floating point processor 78 becomes available and Fbusy no longer signals it as active. After the Fbusy signal is deasserted, operation proceeds to a decision box 410 in which the signal Ferror is checked to determine whether or not an error was noted in the previous floating point instruction. The Ferror signal is generated in the FP control unit 217 (FIG. 2), and is asserted when the floating point unit 78 needs a pre-execution assist or encounters a post-execution fault (i.e., when it encounters an exception).

If Ferror is asserted, the decision box 410 branches to the exception handler microinstructions, as illustrated beginning in a box 412, in which a wake-up microinstruction is issued which wakes up the floating point unit. After the wake-up instruction in the box 412, the floating point unit 78 begins execution of the selected exception handler microinstructions, as illustrated in a box 414. Upon completion of the exception handler microinstructions, a decision is made regarding re-execution of the exception-causing microinstruction as illustrated in the decision box 416. If re-execution is desired, then the microinstruction is re-executed as illustrated in a box 418 before continuing on with execution of the microinstruction awaiting conclusion of handling the exception. This decision is discussed above, wherein it is stated that an indirect execute instruction may be issued by the microcode to re-execute the original microinstruction, and alternately, floating point unit 78 will execute the next sequential floating point microinstruction.

Operation will proceed to an execution box 420 through either the decision box 410 directly if no error was noted from the previous floating point microinstruction (i.e., Ferror was not asserted), or through the boxes 412, 414, 416, and 418 following exception handling. As illustrated in the box 420, the floating point microinstruction begins execution, and operation returns to the box 400 to get the next microinstruction. However, during execution of the floating point microinstruction, operation also branches in parallel to a box 430 in which the Fbusy signal is asserted to signify that the floating point unit 78 is busy. After assertion of the Fbusy signal, execution continues as illustrated in a decision box 432. If execution is completed without an exception being detected, then, as illustrated in the box 434, the Fbusy signal will be de-asserted and the floating point unit 78 stands ready to execute the next floating point instruction, as illustrated in the box 436. However, if an exception is detected, as illustrated in a decision box 438, then execution of the microprocessor is halted as illustrated in a box 440, and the Ferror signal is asserted as illustrated in a box 442. Then, the floating point unit 78 awaits handling of the error as discussed above and illustrated in the decision box 410.

Latching into a first and second latch is specified in the box 420, in which the microinstruction is stored in a first and a second latch for execution, as discussed above with reference to FIG. 4. As illustrated in the box 414, the exception handler microinstructions are executed using only the first latch, therefore leaving the microinstruction in the second latch untouched. After completion of the exception handling, the microinstruction in the second latch may be executed, as illustrated in the box 418.
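The flow of FIG. 6 can be traced in software as an event log. The following Python sketch is a simplified illustration under stated assumptions: clock timing and the Fbusy wait of boxes 406 and 408 are not modeled, the exception handler is represented by a single event, and the function name, parameters, and event strings are all hypothetical.

```python
def run(micro_stream, faulting=(), reexecute=True):
    """Walk a microinstruction stream through the FIG. 6 decisions.

    micro_stream: list of (kind, name) pairs, kind being "fp" or "int".
    faulting: names of FP microinstructions assumed to raise an exception.
    reexecute: whether the handler ends with an indirect execute.
    Returns the list of events in the order they occur in this model.
    """
    events = []
    pending_error = None  # FP microinstruction for which Ferror is still asserted
    for kind, name in micro_stream:
        if kind != "fp":
            events.append(f"int-exec {name}")              # box 404
            continue
        # Boxes 406/408: the wait for Fbusy to clear is not timed here.
        if pending_error is not None:                      # box 410
            events.append("wake-up")                       # box 412
            events.append(f"handler for {pending_error}")  # box 414
            if reexecute:                                  # box 416
                events.append(f"re-exec {pending_error}")  # box 418
            pending_error = None
        events.append(f"fp-exec {name}")                   # box 420
        if name in faulting:                               # boxes 438 to 442
            events.append(f"Ferror {name}")
            pending_error = name
    return events
```

Running the sketch on a stream of FI1, I2, I3, FI2 with FI1 marked as faulting shows the integer microinstructions proceeding after the fault, and the handler running only when FI2 is reached.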

Appendix 1 sets forth a Linpack inner loop. Linpack inner loops are relatively common matrix equations. Linpack inner loops are used as benchmarks for comparing the floating point performance of microprocessors. The Linpack inner loop of Appendix 1 includes both floating point instructions and non-floating point instructions. In the Linpack inner loop of Appendix 1, dummy reads ("dummy rds") have been added to avoid cache misses. The dummy reads have been added at points where floating point instructions are being executed in parallel with non-floating point instructions, so the addition of the dummy reads does not cost any execution clock cycles. Parallelism alone leads to performance gains, but the addition of the dummy reads shows that one can add instructions to take further advantage of the parallelism of floating point/non-floating point instruction execution in order to achieve further performance gains. Indeed, with parallel execution of floating point and non-floating point instructions during execution of the Linpack inner loop of Appendix 1, there is an overall improvement in performance of the microprocessor of approximately 20 percent over non-parallel operation.
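For readers unfamiliar with the benchmark, the arithmetic performed by the inner loop can be written in a few lines of Python. This sketch assumes the update is Y[i] := Y[i] + C*X[i], as the Fmul on x and the Fadd on y in Appendix 1 suggest; it shows only the computation and says nothing about the parallel scheduling or the dummy reads.

```python
def linpack_inner_loop(c, x, y):
    """Update y in place with y[i] = y[i] + c * x[i] and return it."""
    for i in range(len(y)):
        y[i] = y[i] + c * x[i]
    return y
```

Each iteration performs one multiply and one add on floating point data, while the index update and limit check are integer work, which is why the loop benefits from running the two kinds of instructions in parallel.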

In the foregoing specification, the invention has been described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

    APPENDIX 1
    Y[i] := Y[i] + C*X[i]

    Loop1:
    fstp   dword ptr y[ecx]            ;y[i - 1]
    Fld    st                          ;duplicate constant
    Fmul   dword ptr x[ecx + 8]        ;X[i]
    mov    eax, dword ptr y[ecx + 8]   ;dummy rd, cache miss
    add    ecx, 8                      ;i = i + 1
    mov    eax, dword ptr x[ecx + 8]   ;dummy rd, cache miss
    Fadd   dword ptr y[ecx]            ;Y[i]
    cmp    ecx, ebx                    ;limit check
    jne    loop1

What is claimed is:
1. A method of processing instructions in a pipelined processor including parallel processing units and handling exceptions that arise during execution of a first type of microinstruction in a first processing path operable in parallel with a second processing path that executes a second type of microinstruction, said microinstructions being supplied from a decoder and control unit that receives a stream of intermixed macroinstructions including first type and second type macroinstructions, and responsive to each of said macroinstructions outputs one or more microinstructions from a control ROM to a microinstruction bus sequentially in an order determined by the macroinstruction sequence, so that said decoder and control unit output the first type of microinstruction intermixed with the second type of microinstruction, said first processing path including a microinstruction selector having an output coupled to supply a microinstruction to an execution unit in said first processing path, a first latch, and a second latch, said microinstruction selector having a first input coupled to the microinstruction bus, a second input coupled to said first latch, and a third input coupled to said second latch, said instruction processing and exception handling method comprising the steps of:
(a) issuing a microinstruction from the control ROM, said microinstruction being one of the first type and the second type of microinstruction;
(b) if said microinstruction from the step (a) is of the first type, then selecting said first input, processing said microinstruction in the first processing path, and storing said microinstruction in the first latch and the second latch;
(c) if said microinstruction from the step (a) is of the second type, then processing said microinstruction in the second processing path;
(d) repeating steps (a), (b), and (c), until an exception occurs resulting from processing a microinstruction in the first processing path, and then moving to step (e);
(e) if an exception occurs resulting from execution of an exception-causing microinstruction of the first type, then halting execution in said first processing path and repeating steps (a) through (d) and continuing to process any microinstructions that may be provided to the second processing path until in step (b), a next microinstruction of the first type is ready to be provided from the control ROM to the first processing path, and then handling the exception by accessing exception handler microinstructions stored in the control ROM in the control unit and selecting said first input to supply said exception handler microinstructions to the first processing path, and sequentially storing said exception handler microinstructions in the first latch but not the second latch and if handling the exception includes re-executing the exception-causing instruction, then selecting said third input to supply said exception-causing instruction from the second latch and re-executing the exception-causing microinstruction of the first type; and
(f) after handling the exception in the first processing path in the step (e), then providing said next microinstruction of the first type to the first processing path, executing said next microinstruction of the first type, and repeating steps (a)-(d).
2. The method of claim 1 wherein the step (b) further comprises the sub-steps of:
(b)(1) latching said microinstruction of the first type into an input latch that has an output coupled to the first input in the first processing path;
(b)(2) selecting said first input to supply said microinstruction of the first type from said selector to a control unit and a decoder in the first processing path and latching said microinstruction from said multiplexer output into an intermediate latch;
(b)(3) latching said microinstruction of the first type from the intermediate latch into the first latch and the second latch; and
(b)(4) while executing said microinstruction of the first type, selecting said second input to supply said microinstruction from the first latch to the control unit and decoder for the processor of the first type.
3. The method of claim 2, wherein the step (e), handling the exception further comprises the sub-steps of:
(e)(1) in a first clock cycle, latching said exception handler microinstruction into the input latch in the first processing path;
(e)(2) in a second clock cycle, selecting said first input to supply said exception handler microinstruction to the control unit and decoder in the first processing path and latching said exception handler microinstruction into the intermediate latch;
(e)(3) in a third clock cycle, latching said exception handler microinstruction from the intermediate latch into the first latch but not the second latch; and
(e)(4) in subsequent clock cycles, while executing said exception handler microinstruction, selecting said second input supply said exception handler microinstruction from the first latch to the control unit and decoder for the first processing path.
4. The method of claim 1 wherein, while processing said microinstruction in the first processing path in the step (b), selecting said second input to supply said microinstruction from said first latch to said first processing path.
5. The method of claim 1 wherein said selector includes a multiplexer circuit, and selecting said first input comprises selecting a first multiplexer input, selecting said second input comprises selecting a second multiplexer input, and selecting said third input comprises selecting a third multiplexer input.
6. A method of processing instructions in a pipelined processor including parallel processing units and handling exceptions that arise during execution of a first type of microinstruction in a first processing path operable in parallel with a second processing path that executes a second type of microinstruction, said microinstructions being supplied from a decoder and control unit that receives a stream of intermixed macroinstructions including first type and second type macroinstructions, and responsive to each of said macroinstructions outputs one or more microinstructions from a control ROM sequentially in an order determined by the macroinstruction sequence, so that said decoder and control unit output the first type of microinstruction intermixed with the second type of microinstruction, said instruction processing and exception handling method comprising the steps of:
(a) issuing a microinstruction from the control ROM, said microinstruction being one of the first type and the second type of instruction;
(b) if said microinstruction from the step (a) is of the first type, then processing said microinstruction in the first processing path, and storing the microinstruction of the first type in a first latch and a second latch during execution of the microinstruction of the first type, said processing of said microinstruction including the steps of:
(b)(1) in a first clock cycle, latching said microinstruction of the first type into an input latch in the first processing path,
(b)(2) in a second clock cycle, supplying said microinstruction of the first type to a control unit and a decoder in the first processing path and latching said microinstruction from the input latch into an intermediate latch,
(b)(3) in a third clock cycle, latching said microinstruction of the first type from the intermediate latch into the first latch and the second latch, and
(b)(4) in subsequent clock cycles, while executing said microinstruction of the first type, supplying said microinstruction from the first latch to the control unit and decoder for the processor of the first type;
(c) if said microinstruction from the step (a) is of the second type, then processing said microinstruction in the second processing path;
(d) repeating steps (a), (b), and (c), until an exception occurs resulting from processing a microinstruction in the first processing path, and then moving to step (e);
(e) if an exception occurs resulting from execution of an exception-causing microinstruction of the first type, then halting execution in said first processing path and repeating steps (a) through (d) and continuing to process any microinstructions that may be provided to the second processing path until in step (b), a next microinstruction of the first type is ready to be provided from the control ROM to the first processing path, and then handling the exception by accessing exception handler microinstructions stored in the control ROM in the control unit and supplying said exception handler microinstructions to the first processing path, wherein said handling of the exception includes the steps of:
(e)(1) latching said exception handler microinstruction into the input latch,
(e)(2) supplying said exception handler microinstruction from the input latch to a first input of a multiplexer that is coupled to the control unit and decoder in the first processing path and latching said exception handler microinstruction into the intermediate latch, and applying said exception handler microinstruction from said input latch to a first input of a multiplexer that supplies an output to the control unit and decoder for the first processing path, said output also being provided to the intermediate latch, and selecting said first input to provide said exception handler microinstruction as the multiplexer output,
(e)(3) latching said exception handler microinstruction from the intermediate latch into the first latch but not the second latch,
(e)(4) while executing said exception handler microinstruction, supplying said exception handler microinstruction from the first latch to the control unit and decoder for the first processing path, and applying said exception handler microinstruction from said first latch to a second input of the multiplexer, and selecting said second input to provide the multiplexer output;
(e)(5) if the exception-causing microinstruction is to be re-executed, applying the second latch to a third input of the multiplexer, and selecting said third input to provide the multiplexer output; and
(f) after handling the exception in the first processing path in the step (e), then providing said next microinstruction of the first type to the first processing path, executing said next microinstruction of the first type, and repeating steps (a)-(d).